Estimation of Fuzzy Error Matrix Accuracy Measures Under Stratified Random Sampling
Authors: Stephen V. Stehman, Manoj K. Arora, Teerasit Kasetkasem, and Pramod K. Varshney
Abstract
A fuzzy error matrix may be used to summarize accuracy assessment information when both the map and reference data are labelled using a soft classification. Accuracy measures analogous to the familiar overall, user’s, and producer’s accuracies of a hard classification can be derived from a fuzzy error matrix. The formulas for estimating the fuzzy error matrix and accompanying accuracy measures depend on the sampling design used to collect the reference data. We derive these estimation formulas for stratified random sampling, a design commonly implemented in practice. A simulation study is conducted to confirm the validity of the stratified sampling estimators.
Introduction
Digital image classification is frequently employed to produce land-cover maps from a variety of aircraft- and satellite-based sensors. Classification may be performed on a per-pixel basis employing a crisp or hard classification, or on a sub-pixel basis using a fuzzy or soft classification (Schowengerdt, 1997). For a hard classification, each pixel of the image is assigned to a single class. In contrast, a soft classification assigns to each pixel a degree or grade of membership in each of the land-cover classes.
Accuracy assessment is an integral part of the classification process. Accuracy assessments for maps employing a hard classification are typically based on an error matrix and associated summary measures such as overall, user’s, and producer’s accuracies derived from the error matrix (Foody, 2002). No such standard approach is in common use when the classification is soft, and methods are needed to extend the notion of hard-class matching to soft-class matching to produce statements that can represent uncertainty expressed in reference and classified data (Binaghi et al., 1999, p. 936). A soft classification could be hardened so that the analyses applicable to a hard classification could be used, but often this entails an unacceptable loss of information.
Several accuracy measures have been proposed specifically for use with soft classification. These include entropy (Finn, 1993; Maselli et al., 1994), cross-entropy (Foody, 1995), Euclidean and L1 distances (Foody and Arora, 1996), correlation (Maselli et al., 1996), Morisita’s index (Ricotta, 2004), and measures analogous to overall, user’s, and producer’s accuracies derived from a fuzzy error matrix (Binaghi et al., 1999). We focus on Binaghi et al.’s (1999) error matrix formulation because this places accuracy assessment of a soft classification into a context familiar to users already experienced with accuracy assessment of a hard classification.
Whether the classification is hard or soft, accuracy assessment requires selecting a sample of locations and determining the reference or ground condition at these sample locations. The accuracy measures are then estimated from the reference sample data. Adhering to the principle of consistent estimation requires that the estimation formulas take into account the sampling design used to collect the reference data (Stehman and Czaplewski, 1998). Stratified random sampling is employed in accuracy assessment to ensure that rare classes are allocated large enough sample sizes to precisely estimate accuracy of each class. Because stratified random sampling is commonly implemented in practice, it is important to use estimation formulas appropriate for this design. Applying formulas derived for simple random sampling (SRS) to data obtained by a stratified sampling design will generally produce biased estimates. We derive the formulas appropriate for stratified random sampling to estimate the fuzzy error matrix accuracy measures suggested by Binaghi et al. (1999). We also derive the formulas for the standard errors of these accuracy estimates.
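The bias that arises from applying SRS formulas to stratified data can be illustrated with a small numerical sketch. The code below is not the set of estimators derived in this paper; it assumes a generic textbook stratified estimator in which an accuracy measure is written as a ratio of two population totals, each estimated by weighting the stratum sample mean by the stratum size N_h. The function names, the stratum sizes, and the sample values are all hypothetical.

```python
def stratified_ratio(strata):
    """Ratio of two stratified-estimated totals.

    strata: list of (N_h, y_values, x_values), where N_h is the number of
    pixels in stratum h and y_values, x_values are the sampled numerator
    and denominator quantities (hypothetical inputs for illustration).
    """
    num = 0.0
    den = 0.0
    for n_pop, ys, xs in strata:
        n_h = len(ys)
        num += n_pop * sum(ys) / n_h  # estimated total of y in stratum h
        den += n_pop * sum(xs) / n_h  # estimated total of x in stratum h
    return num / den


def naive_ratio(strata):
    """Pools all sample pixels and ignores the strata (SRS-style formula)."""
    ys = [y for _, y_s, _ in strata for y in y_s]
    xs = [x for _, _, x_s in strata for x in x_s]
    return sum(ys) / sum(xs)


# Hypothetical data: a common class (9000 pixels) and a rare class
# (1000 pixels), each sampled with 4 pixels, so the rare stratum is
# heavily over-represented in the sample.
strata = [
    (9000, [0.9, 0.8, 0.9, 0.8], [1.0, 1.0, 1.0, 1.0]),
    (1000, [0.2, 0.4, 0.2, 0.4], [1.0, 1.0, 1.0, 1.0]),
]

weighted = stratified_ratio(strata)
pooled = naive_ratio(strata)
print(round(weighted, 3), round(pooled, 3))
```

In this made-up example the weighted estimate is about 0.795 while the pooled estimate is about 0.575, because the pooled formula gives the rare, poorly mapped stratum the same influence as the common one.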
Fuzzy Error Matrix
An error matrix summarizes the correspondence between the map labels assigned to the pixels and the corresponding ground condition (i.e., reference class) labels observed from existing maps, aerial photographs, ground surveys, or video images. The columns of the error matrix represent the reference class, and the rows represent the map class. Diagonal elements of the error matrix represent agreement between the map and reference labels, and the off-diagonal elements reflect disagreements between the map and reference labels. Binaghi et al. (1999) demonstrated how a fuzzy error matrix could be produced for an assessment where a soft classification is employed for both the map and reference data, and noted that this “fuzzy error matrix performs as precisely as the corresponding traditional matrix and is a clear generalization of the latter.”
Stephen V. Stehman is at SUNY College of Environmental Science & Forestry, 320 Bray Hall, Syracuse, NY 13210 ([email protected]). Manoj K. Arora is with Geomatics Engineering, Department of Civil Engineering, Indian Institute of Technology Roorkee, Roorkee, 247667 India. Teerasit Kasetkasem is at the Electrical Engineering Department, Kasetsart University, Jatujak, Bangkok, Thailand 10900. Pramod K. Varshney is at the Electrical Engineering and Computer Science Department, Syracuse University, 335 Link Hall, Syracuse, NY 13244.
Photogrammetric Engineering & Remote Sensing, Vol. 73, No. 2, February 2007, pp. 165–173. 0099-1112/07/7302–0165/$3.00/0 © 2007 American Society for Photogrammetry and Remote Sensing
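As an illustration of the fuzzy error matrix described in the section above, the following sketch builds one from per-pixel membership grades. It assumes the minimum operator of Binaghi et al. (1999), so that each pixel contributes min(map grade in class i, reference grade in class j) to cell (i, j), and it assumes that user’s and producer’s accuracies are taken relative to the total map and reference grades of each class. The sums are unweighted, as for a census or an equal-probability sample; the function names and the data are hypothetical, and the stratified estimators derived in the paper are not reproduced here.

```python
def fuzzy_error_matrix(map_grades, ref_grades):
    """Rows are map classes, columns are reference classes."""
    k = len(map_grades[0])
    matrix = [[0.0] * k for _ in range(k)]
    for m, r in zip(map_grades, ref_grades):
        for i in range(k):
            for j in range(k):
                # Assumed composition rule: minimum of the two grades
                matrix[i][j] += min(m[i], r[j])
    return matrix


def accuracy_measures(map_grades, ref_grades):
    """Overall, user's, and producer's accuracies (assumed definitions)."""
    k = len(map_grades[0])
    matrix = fuzzy_error_matrix(map_grades, ref_grades)
    map_totals = [sum(m[i] for m in map_grades) for i in range(k)]
    ref_totals = [sum(r[j] for r in ref_grades) for j in range(k)]
    diagonal = [matrix[i][i] for i in range(k)]
    overall = sum(diagonal) / sum(ref_totals)
    users = [diagonal[i] / map_totals[i] for i in range(k)]
    producers = [diagonal[j] / ref_totals[j] for j in range(k)]
    return overall, users, producers


# Hypothetical grades for two pixels and two classes
map_grades = [[0.75, 0.25], [0.25, 0.75]]
ref_grades = [[0.5, 0.5], [0.0, 1.0]]

matrix = fuzzy_error_matrix(map_grades, ref_grades)
overall, users, producers = accuracy_measures(map_grades, ref_grades)
print(matrix)
print(overall, users, producers)
```

When every grade is 0 or 1, the minimum operator gives each pixel a single count of 1 in exactly one cell, so the result reduces to the traditional hard-classification error matrix.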
Similar resources
A Study on the Accuracy and Precision of Estimation of the Number, Basal Area and Standing Trees Volume per Hectare Using of some Sampling Methods in Forests of NavAsalem
The present study aimed to investigate the accuracy and precision estimation of the number, basal area and volume of the standing trees by methods of random and systematic random sampling in the forests of West Guilan. The cost or inventory time was determined using the criteria (E%2 × T). Inventory was carried out by complete sampling (census) in an area of 52 hectares. The study area (sect...
Classifier Risk Estimation under Limited Labeling Resources
In this paper we propose strategies for estimating performance of a classifier when labels cannot be obtained for the whole test set. The number of test instances which can be labeled is very small compared to the whole test data size. The goal then is to obtain a precise estimate of classifier performance using as little labeling resource as possible. Specifically, we try to answer, how to sel...
Accuracy Assessment Method for Wetland Object-based Classification
The object-based classification approach needs an adaptation of the traditional accuracy assessment methodology. This paper presents a methodology that considers the inherent variability within each wetland class for the sampling size calculation based on objects and a stratified random sampling to select samples for each category. The polygons selected for validation are overlaid on the data u...
Estimation of population mean in the presence of measurement error and non response under stratified random sampling
In the present paper we propose an improved class of estimators in the presence of measurement error and non-response under stratified random sampling for estimating the finite population mean. The theoretical and numerical studies reveal that the proposed class of estimators performs better than other existing estimators.
Improved Exponential Estimator in Stratified Random Sampling
In this article we have considered the problem of estimating the population mean Y in the stratified random sampling using the information of an auxiliary variable x which is correlated with y and suggested improved exponential ratio estimators in the stratified random sampling. The mean square error (MSE) equations for the proposed estimators have been derived and it is shown that the prop...
Publication date: 2006